Improving Imbalanced Land Cover Classification with K-Means SMOTE: Detecting and Oversampling Distinctive Minority Spectral Signatures

Abstract

Land cover maps are a critical tool to support informed policy development, planning, and resource management decisions. Given its significant upsides, the automatic production of Land Use/Land Cover (LULC) maps has been a topic of interest for the remote sensing community for several years, but it is still fraught with technical challenges. One such challenge is the imbalanced nature of most remotely sensed data. The asymmetric class distribution negatively impacts the performance of classifiers and adds a new source of error to the production of these maps. In this paper, we address the imbalanced learning problem by using K-means SMOTE, which combines K-means clustering with the Synthetic Minority Oversampling Technique (SMOTE), as an improved oversampling algorithm. K-means SMOTE improves the quality of newly created artificial data by addressing both the between-class imbalance, as traditional oversamplers do, and the within-class imbalance, avoiding the generation of noisy data while effectively overcoming data imbalance. K-means SMOTE is compared with three popular oversampling methods (Random Oversampling, SMOTE, and Borderline-SMOTE) on seven benchmark datasets, using three classifiers (Logistic Regression, K-Nearest Neighbors, and Random Forest Classifier) and evaluation metrics computed with a five-fold cross-validation approach and different initialization seeds. The statistical analysis of the results shows that the proposed method consistently outperforms the remaining oversamplers, producing higher-quality land cover classifications. These results suggest that LULC data can benefit significantly from the use of more sophisticated oversamplers, as spectral signatures for the same class can vary according to geographical distribution.
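To make the idea in the abstract concrete, the following is a minimal, standard-library-only sketch of cluster-based oversampling. It is not the authors' implementation. The function names (`kmeans`, `kmeans_smote`), the toy data, and three simplifications are assumptions made for illustration: a cluster is kept when minority samples outnumber the others (instead of a tunable imbalance-ratio threshold), sparsity is taken as the mean pairwise distance between a cluster's minority samples, and each synthetic point interpolates between two random minority samples of the same cluster rather than between a sample and one of its k nearest neighbors.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def kmeans_smote(X, y, minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority points inside minority-dominated clusters."""
    rng = random.Random(seed)
    labels = kmeans(X, k, rng)

    # Keep only clusters where the minority class dominates (simplified filter).
    selected = {}
    for c in range(k):
        mino = [x for x, t, lab in zip(X, y, labels) if lab == c and t == minority]
        size = sum(1 for lab in labels if lab == c)
        if len(mino) >= 2 and len(mino) > size - len(mino):
            selected[c] = mino
    if not selected:
        return []

    # Sparser clusters get a larger share of the synthetic samples.
    def sparsity(pts):
        dists = [math.dist(a, b) for i, a in enumerate(pts) for b in pts[i + 1:]]
        return sum(dists) / len(dists) + 1e-12  # epsilon avoids all-zero weights

    clusters = list(selected)
    weights = [sparsity(selected[c]) for c in clusters]

    synthetic = []
    for _ in range(n_new):
        c = rng.choices(clusters, weights=weights)[0]
        a, b = rng.sample(selected[c], 2)
        lam = rng.random()  # position along the segment from a to b
        synthetic.append(tuple(ai + lam * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic


# Toy data: a majority blob near the origin, a minority blob near (10, 10),
# and one minority "noise" point sitting inside the majority region.
majority = [(i * 0.1, j * 0.1) for i in range(5) for j in range(4)]
minority_pts = [(10 + i * 0.1, 10 + j * 0.1) for i in range(3) for j in range(2)]
X = majority + minority_pts + [(0.2, 0.15)]
y = [0] * len(majority) + [1] * (len(minority_pts) + 1)

new_points = kmeans_smote(X, y, minority=1, n_new=14, k=2, seed=0)
```

In this toy setup the isolated minority point falls in a majority-dominated cluster, so no synthetic samples are generated around it. This is the noise-avoidance behavior the abstract describes.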

Similar resources

Oversampling for Imbalanced Learning Based on K-Means and SMOTE

Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods which generate artificial data to achieve a balanced class distribution are more versatile than modifications to the classification a...

Oversampling Method for Imbalanced Classification

The classification problem for imbalanced datasets is pervasive in many data mining domains. Imbalanced classification has been a hot topic in the academic community. From the data level to the algorithm level, many solutions have been proposed to tackle the problems resulting from imbalanced datasets. SMOTE is the most popular data-level method, and many derivations based on it have been developed to ...

Geometric SMOTE: Effective oversampling for imbalanced learning through a geometric extension of SMOTE

Classification of imbalanced datasets is a challenging task for standard algorithms. Although many methods exist to address this problem in different ways, generating artificial data for the minority class is a more general approach compared to algorithmic modifications. SMOTE algorithm and its variations generate synthetic samples along a line segment that joins minority class instances. In th...
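The line-segment construction mentioned in this snippet is commonly written as below. The symbol names are chosen here for illustration: $x_i$ is a minority-class sample, $x_{nn}$ is one of its minority-class nearest neighbors, and $\lambda$ is drawn uniformly at random.

```latex
x_{\text{new}} = x_i + \lambda \, (x_{nn} - x_i), \qquad \lambda \sim \mathcal{U}(0, 1)
```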

RBM-SMOTE: Restricted Boltzmann Machines for Synthetic Minority Oversampling Technique

The problem of imbalanced data, i.e., when the class labels are unequally distributed, is encountered in many real-life applications, e.g., credit scoring and medical diagnostics. Various approaches aimed at dealing with imbalanced data have been proposed. One of the best-known data pre-processing methods is the Synthetic Minority Oversampling Technique (SMOTE). However, SMOTE may generate ...

Improving SMOTE with Fuzzy Rough Prototype Selection to Detect Noise in Imbalanced Classification Data

In this paper, we present a prototype selection technique for imbalanced data, Fuzzy Rough Imbalanced Prototype Selection (FRIPS), to improve the quality of the artificial instances generated by the Synthetic Minority Over-sampling TEchnique (SMOTE). Using fuzzy rough set theory, the noise level of each instance is measured, and instances for which the noise level exceeds a certain threshold le...

Journal

Journal title: Information

Year: 2021

ISSN: 2078-2489

DOI: https://doi.org/10.3390/info12070266